66 research outputs found

    MOLIERE: Automatic Biomedical Hypothesis Generation System

    Hypothesis generation is becoming a crucial time-saving technique that allows biomedical researchers to quickly discover implicit connections between important concepts. Typically, these systems operate on domain-specific fractions of public medical data. MOLIERE, in contrast, utilizes information from over 24.5 million documents. At the heart of our approach lies a multi-modal and multi-relational network of biomedical objects extracted from several heterogeneous datasets from the National Center for Biotechnology Information (NCBI). These objects include but are not limited to scientific papers, keywords, genes, proteins, diseases, and diagnoses. We model hypotheses using Latent Dirichlet Allocation applied to abstracts found near shortest paths discovered within this network, and demonstrate the effectiveness of MOLIERE by performing hypothesis generation on historical data. Our network, implementation, and resulting data are all publicly available to the broad scientific community.
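
    The path-finding step described in this abstract can be illustrated with a minimal sketch. It is not the MOLIERE implementation: it assumes a tiny unweighted toy network (every node name below is hypothetical), swaps in plain breadth-first search for whatever shortest-path routine the authors use, and stops at collecting the papers near the path, leaving out the topic-modeling step entirely.

```python
from collections import deque

# Hypothetical toy network: node names and edges are illustrative only,
# not taken from the MOLIERE data.
GRAPH = {
    "keyword:A": ["paper:1"],
    "paper:1": ["keyword:A", "gene:X", "paper:2"],
    "gene:X": ["paper:1", "paper:3"],
    "paper:2": ["paper:1"],
    "paper:3": ["gene:X", "keyword:B"],
    "keyword:B": ["paper:3"],
}


def shortest_path(graph, source, target):
    """Breadth-first search; returns one shortest path as a list, or None."""
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk the parent links back to the source, then reverse.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in graph.get(node, []):
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return None


def papers_near_path(graph, path):
    """Collect paper nodes that lie on the path or are adjacent to it."""
    nearby = set()
    for node in path:
        for candidate in [node, *graph.get(node, [])]:
            if candidate.startswith("paper:"):
                nearby.add(candidate)
    return sorted(nearby)


path = shortest_path(GRAPH, "keyword:A", "keyword:B")
corpus = papers_near_path(GRAPH, path)
```

    In this sketch, `corpus` holds the identifiers of the papers whose abstracts would be passed on to a topic model.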

    Moliere: Automatic Biomedical Hypothesis Generation

    Medical research is expensive and risky. Drug manufacturers need to prioritize their early investments with very little experimental data in order to discover new treatments as efficiently as possible. Moliere is a hypothesis generation system that identifies implicit, as yet unknown connections already present within the body of medical literature. Using our discovery and ranking system, medical researchers can identify fruitful research directions earlier in the discovery process, before time-consuming and expensive experiments. For example, we used Moliere to identify a new gene-treatment target for HIV-associated Neurodegenerative Disease, which we later confirmed in laboratory experiments.

    Accelerating COVID-19 research with graph mining and transformer-based learning

    In 2020, the White House released the "Call to Action to the Tech Community on New Machine Readable COVID-19 Dataset," in which artificial intelligence experts were asked to collect data and develop text mining techniques that can help the science community answer high-priority scientific questions related to COVID-19. The Allen Institute for AI and collaborators announced the availability of a rapidly growing open dataset of publications, the COVID-19 Open Research Dataset (CORD-19). As the pace of research accelerates, biomedical scientists struggle to stay current. To expedite their investigations, scientists leverage hypothesis generation systems, which can automatically inspect published papers to discover novel implicit connections. We present two automated general-purpose hypothesis generation systems, AGATHA-C and AGATHA-GP, for COVID-19 research. The systems are based on graph mining and the transformer model. They are extensively validated using retrospective information rediscovery and proactive analysis involving human-in-the-loop expert analysis. Both systems achieve high-quality predictions across domains (in some domains an ROC AUC of up to 0.97) in fast computational time and are released to the broad scientific community to accelerate biomedical research. In addition, through a study curated by domain experts, we show that the systems are able to discover ongoing research findings such as the relationship between COVID-19 and the hormone oxytocin.
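
    ROC AUC, the metric quoted in this abstract, is a value between 0 and 1: the probability that a randomly chosen positive example is scored above a randomly chosen negative one. The sketch below computes it directly from that definition. The labels and scores are made-up illustrative values, not results from AGATHA, and the function name is hypothetical.

```python
def roc_auc(labels, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Illustrative data only: 1 marks a true connection, 0 a false one.
example_labels = [1, 1, 0, 1, 0, 0]
example_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
auc = roc_auc(example_labels, example_scores)
```

    Here eight of the nine positive-negative pairs are ordered correctly, so `auc` is 8/9. The pairwise loop is quadratic in the number of examples; it is written for clarity rather than speed.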

    Hedgehog/Gli supports androgen signaling in androgen deprived and androgen independent prostate cancer cells

    Background: Castration resistant prostate cancer (CRPC) develops as a consequence of hormone therapies used to deplete androgens in advanced prostate cancer patients. CRPC cells are able to grow in a low androgen environment and this is associated with anomalous activity of their endogenous androgen receptor (AR) despite the low systemic androgen levels in the patients. Therefore, the reactivated tumor cell androgen signaling pathway is thought to provide a target for control of CRPC. Previously, we reported that Hedgehog (Hh) signaling was conditionally activated by androgen deprivation in androgen sensitive prostate cancer cells and here we studied the potential for cross-talk between Hh and androgen signaling activities in androgen deprived and androgen independent (AI) prostate cancer cells. Results: Treatment of a variety of androgen-deprived or AI prostate cancer cells with the Hh inhibitor, cyclopamine, resulted in dose-dependent modulation of the expression of genes that are regulated by androgen. The effect of cyclopamine on endogenous androgen-regulated gene expression in androgen deprived and AI prostate cancer cells was consistent with the suppressive effects of cyclopamine on the expression of a reporter gene (luciferase) from two different androgen-dependent promoters. Similarly, reduction of smoothened (Smo) expression with siRNA co-suppressed expression of androgen-inducible KLK2 and KLK3 in androgen deprived cells without affecting the expression of androgen receptor (AR) mRNA or protein. Cyclopamine also prevented the outgrowth of AI cells from androgen growth-dependent parental LNCaP cells and suppressed the growth of an overt AI-LNCaP variant whereas supplemental androgen (R1881) restored growth to the AI cells in the presence of cyclopamine. Conversely, overexpression of Gli1 or Gli2 in LNCaP cells enhanced AR-specific gene expression in the absence of androgen. Overexpressed Gli1/Gli2 also enabled parental LNCaP cells to grow in androgen depleted medium. AR protein co-immunoprecipitates with Gli2 protein from transfected 293T cell lysates. Conclusions: Collectively, our results indicate that Hh/Gli signaling supports androgen signaling and AI growth in prostate cancer cells in a low androgen environment. The finding that Gli2 co-immunoprecipitates with AR protein suggests that an interaction between these proteins might be the basis for Hedgehog/Gli support of androgen signaling under this condition.

    Identifying Cancers Impacted by CDK8/19

    CDK8 and CDK19 Mediator kinases are transcriptional co-regulators implicated in several types of cancer. Small-molecule CDK8/19 inhibitors have recently entered or are entering clinical trials, starting with breast cancer and acute myeloid leukemia (AML). To identify other cancers where these novel drugs may provide benefit, we queried genomic and transcriptomic databases for potential impact of CDK8, CDK19, or their binding partner CCNC. sgRNA analysis of a panel of tumor cell lines showed that most tumor types represented in the panel, except for some central nervous system tumors, were not dependent on these genes. In contrast, analysis of clinical samples for alterations in these genes revealed a high frequency of gene amplification in two highly aggressive subtypes of prostate cancer and in some cancers of the GI tract, breast, bladder, and sarcomas. Analysis of survival correlations identified a group of cancers where CDK8 expression correlated with shorter survival (notably breast, prostate, and cervical cancers, and esophageal adenocarcinoma). In some cancers (AML, melanoma, ovarian, and others), such correlations were limited to samples with a below-median tumor mutation burden. These results suggest that Mediator kinases are especially important in cancers that are driven primarily by transcriptional rather than mutational changes and warrant an investigation of their role in additional cancer types.